
Fix integer division by zero crash in CPU EP Div operator #27693

Merged
tianleiwu merged 5 commits into main from
copilot/fix-int8-division-by-zero
Mar 19, 2026


Copilot AI commented Mar 17, 2026

Description

Add a pre-check for zero values in the divisor tensor for integral types in Div<T>. Returns an error Status instead of hitting undefined behavior (SIGFPE / structured exception).

  • element_wise_ops.h: When the divisor is a constant initializer, the constructor retrieves it with TryGetConstantInput and checks it for zeros once at kernel creation time, avoiding per-Compute overhead. A divisor_is_validated_constant_ flag records whether this one-time check was performed.
  • element_wise_ops.cc: An if constexpr (std::is_integral<T>::value) guard scans non-constant divisors before calling UntypedBroadcastTwo and skips the scan when the constant divisor was already validated. The guard is compiled away for float/double/half, so non-integer paths pay no cost.
  • element_wise_ops_test.cc: Added Div_int8_by_zero, Div_int32_by_zero, Div_int64_by_zero_scalar tests covering tensor and scalar divisor cases, plus Div_int32_by_zero_constant_initializer to exercise the TryGetConstantInput constructor path with is_initializer = true.
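
The runtime pre-check described in the list above can be sketched roughly as follows. This is a minimal standalone illustration under stated assumptions, not the ONNX Runtime code: `CheckStatus`, `CheckDivisorForZero` and `DivideChecked` are hypothetical names, a plain `std::vector` stands in for the tensor, and broadcasting is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a status object; not the ONNX Runtime Status class.
struct CheckStatus {
  bool ok = true;
  std::string message;
};

// Scans the divisor for zeros, but only for integral element types.
// For float/double the branch is discarded at compile time.
template <typename T>
CheckStatus CheckDivisorForZero([[maybe_unused]] const std::vector<T>& divisor) {
  if constexpr (std::is_integral_v<T>) {
    if (std::any_of(divisor.begin(), divisor.end(),
                    [](T v) { return v == T{0}; })) {
      return {false, "Integer division by zero"};
    }
  }
  return {};
}

// Element-wise division of equally sized inputs (no broadcasting in this sketch).
// Returns a failed status instead of dividing when an integral divisor holds a zero.
template <typename T>
CheckStatus DivideChecked(const std::vector<T>& a, const std::vector<T>& b,
                          std::vector<T>& out) {
  assert(a.size() == b.size());
  CheckStatus status = CheckDivisorForZero(b);
  if (!status.ok) return status;
  out.resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = static_cast<T>(a[i] / b[i]);
  }
  return status;
}
```

A caller would inspect `status.ok` before reading `out`. The sketch guards only against zero divisors; signed overflow such as the minimum value divided by -1 is a separate case it does not address.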

Motivation and Context

Integer division by zero is UB in C++ and causes a hardware exception that crashes the process. Float types produce inf/NaN naturally, but int8/int16/int32/int64/uint* types do not. This was reported via Chromium (https://issues.chromium.org/issues/491835014) with a trivial repro: tensor<int8> / scalar(0).

Original prompt

This section details on the original issue you should resolve

<issue_title>int8 / 0 exception not caught for cpu ep</issue_title>
<issue_description>See https://issues.chromium.org/issues/491835014.

Repro:
a=tensor<int8>
b=tensor<int8>, ie a scalar that is 0
model that does a/b

Stack trace:

onnxruntime.dll!Eigen::internal::scalar_quotient_op<signed char,signed char>::operator()(const char &) Line 437      C++
      [Inline Frame] onnxruntime.dll!Eigen::internal::binary_evaluator<Eigen::CwiseBinaryOp<Eigen::internal::scalar_quotient_op<signed char,signed char>,Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<signed char>,Eigen::Array<signed char,-1,1,0,-1,1> const> const ,Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1> const ,0,Eigen::Stride<0,0>>> const>,Eigen::internal::IndexBased,Eigen::internal::IndexBased,signed char,signed char>::coeff(__int64) Line 910    C++
 ...
      [Inline Frame] onnxruntime.dll!Eigen::internal::Assignment<Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1>,0,Eigen::Stride<0,0>>,Eigen::CwiseBinaryOp<Eigen::internal::scalar_quotient_op<signed char,signed char>,Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<signed char>,Eigen::Array<signed char,-1,1,0,-1,1> const> const ,Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1> const ,0,Eigen::Stride<0,0>>> const>,Eigen::internal::assign_op<signed char,signed char>,Eigen::internal::Dense2Dense,void>::run(Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1>,0,Eigen::Stride<0,0>> &) Line 855      C++
      [Inline Frame] onnxruntime.dll!Eigen::internal::call_assignment_no_alias(Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1>,0,Eigen::Stride<0,0>> &) Line 797      C++
      [Inline Frame] onnxruntime.dll!Eigen::internal::call_assignment(Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1>,0,Eigen::Stride<0,0>> &) Line 768   C++
      [Inline Frame] onnxruntime.dll!Eigen::internal::call_assignment(Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1>,0,Eigen::Stride<0,0>> &) Line 750   C++
      [Inline Frame] onnxruntime.dll!Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1>,0,Eigen::Stride<0,0>>>::operator=(const Eigen::DenseBase<Eigen::CwiseBinaryOp<Eigen::internal::scalar_quotient_op<signed char,signed char>,Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<signed char>,Eigen::Array<signed char,-1,1,0,-1,1> const> const ,Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<signed char,-1,1,0,-1,1> const ,0,Eigen::Stride<0,0>>> const>> &) Line 59 C++
      [Inline Frame] onnxruntime.dll!onnxruntime::Div<signed char>::Compute::__l2::<lambda_998187df037dec36fd0905b4142c682e>::operator()(onnxruntime::BroadcastHelper &) Line 685   C++
      onnxruntime.dll!<lambda_998187df037dec36fd0905b4142c682e>::<lambda_invoker_cdecl>(onnxruntime::BroadcastHelper & per_iter_bh) Line 686    C++
      [External Code]   
      [Inline Frame] onnxruntime.dll!std::_Func_class<void,__int64,__int64>::operator()(__int64 <_Args_0>, __int64 <_Args_1>) Line 926    C++
      onnxruntime.dll!onnxruntime::concurrency::ThreadPool::ParallelFor(__int64 n, const onnxruntime::TensorOpCost & c, const std::function<void __cdecl(__int64,__int64)> & f) Line 628  C++
      onnxruntime.dll!onnxruntime::concurrency::ThreadPool::TryParallelFor(onnxruntime::concurrency::ThreadPool * tp, __int64 total, const onnxruntime::TensorOpCost & cost_per_unit, const std::function<void __cdecl(__int64,__int64)> & fn) Line 705     C++
      onnxruntime.dll!onnxruntime::ParallelizeSingleSpan<onnxruntime::BroadcastHelper>(onnxruntime::BroadcastHelper & helper, const onnxruntime::ProcessBroadcastSpanFuncs & functors) Line 955 C++
      onnxruntime.dll!onnxruntime::BroadcastLooper<onnxruntime::BroadcastHelper>(onnxruntime::BroadcastHelper & helper, const onnxruntime::ProcessBroadcastSpanFuncs & functors) Line 1006      C++
      onnxruntime.dll!onnxruntime::UntypedBroadcastTwo(onnxruntime::OpKernelContext & context, const onnxruntime::ProcessBroadcastSpanFuncs & funcs, double unit_cost, void * user_data) Line 2305    C++
      onnxruntime.dll!onnxruntime::Div<signed char>::Compute(onnxruntime::OpKernelContext * context) Line 695     C++
      

</issue_description>

Comments on the Issue (you are @copilot in this section)



For integral types, integer division by zero causes undefined behavior
in C++ and results in a hardware exception (SIGFPE on Linux, structured
exception on Windows). This adds a pre-check of the divisor tensor for
any zero values and returns an error Status instead of crashing.

Fixes: int8 / 0 exception not caught for cpu ep

Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix int8 / 0 exception not caught for cpu ep" to "Fix integer division by zero crash in CPU EP Div operator" Mar 17, 2026
Copilot AI requested a review from tianleiwu March 17, 2026 06:32
@guschmue

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@guschmue guschmue marked this pull request as ready for review March 17, 2026 20:17
@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

When the divisor (input 1) is a constant initializer, validate for
integer division by zero once in the constructor using
OpKernelInfo::TryGetConstantInput, and skip the per-Compute check.
Non-constant divisors still get checked at runtime.

Co-authored-by: skottmckay <979079+skottmckay@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
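
The validate-once idea in the commit message above can be sketched as a small standalone class. Everything here is an assumption made for illustration: `IntDivKernel` is a hypothetical type, a nullable pointer stands in for what `OpKernelInfo::TryGetConstantInput` would provide, and throwing from the constructor is a choice of this sketch rather than something the commit message states for `Div`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical kernel illustrating "validate a constant divisor once".
class IntDivKernel {
 public:
  // constant_divisor is non-null only when the divisor is known at creation time.
  explicit IntDivKernel(const std::vector<std::int32_t>* constant_divisor) {
    if (constant_divisor != nullptr) {
      if (HasZero(*constant_divisor)) {
        throw std::invalid_argument("Integer division by zero in constant divisor");
      }
      divisor_is_validated_constant_ = true;
    }
  }

  // Returns false instead of dividing when a zero divisor element is found.
  // When the divisor was validated at construction, the caller is expected to
  // pass that same constant here, so the scan is skipped.
  // Only zero divisors are checked; overflow cases are out of scope.
  bool Compute(const std::vector<std::int32_t>& a,
               const std::vector<std::int32_t>& b,
               std::vector<std::int32_t>& out) const {
    assert(a.size() == b.size());
    if (!divisor_is_validated_constant_ && HasZero(b)) return false;
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] / b[i];
    return true;
  }

  bool divisor_is_validated_constant() const {
    return divisor_is_validated_constant_;
  }

 private:
  static bool HasZero(const std::vector<std::int32_t>& v) {
    return std::find(v.begin(), v.end(), 0) != v.end();
  }

  bool divisor_is_validated_constant_ = false;
};
```

The trade-off is that a constant divisor is scanned once per kernel instead of once per `Compute` call, while a divisor that only arrives at run time is still scanned on every call.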
@guschmue

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@tianleiwu

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@tianleiwu tianleiwu merged commit 44e8b38 into main Mar 19, 2026
92 of 95 checks passed
@tianleiwu tianleiwu deleted the copilot/fix-int8-division-by-zero branch March 19, 2026 23:05
adrastogi pushed a commit that referenced this pull request Mar 20, 2026
- Fixes #27686

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: skottmckay <979079+skottmckay@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
hariharans29 pushed a commit that referenced this pull request Mar 30, 2026
### Description

Add a pre-check for zero values in the divisor tensor for integral types
in `Mod`. Returns an error `Status` instead of hitting undefined
behavior (SIGFPE / structured exception).

- **`element_wise_ops.cc`**: Added `CheckZeroDivisorImpl` as a single
template struct in the `mod_internal` namespace using `if constexpr
(std::is_integral<T>::value)` to guard the check — no-op for non-integer
types. The struct's `operator()` returns `Status` (via `ORT_RETURN_IF`)
and is dispatched with `InvokeRet<Status>`. When the divisor is a
constant initializer, `TryGetConstantInput` validates for zeros once at
kernel creation time in the out-of-line constructor (using
`ORT_THROW_IF_ERROR`), avoiding per-`Compute` overhead. A
`divisor_is_validated_constant_` flag tracks whether the one-time check
was performed. In `Compute`, non-constant divisors are scanned via the
type dispatcher (using `ORT_RETURN_IF_ERROR`) before calling
`CallModImpl`, skipping the check when the constant was already
validated. The Mod constructor is defined out-of-line after the
`mod_internal` namespace to keep it contiguous.
- **`element_wise_ops_test.cc`**: Added `Mod_int8_by_zero`,
`Mod_int32_by_zero`, `Mod_int64_by_zero_scalar` tests covering tensor
and scalar divisor cases, plus `Mod_int32_by_zero_constant_initializer`
to exercise the `TryGetConstantInput` constructor path with
`is_initializer = true`.
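
A rough sketch of a type-dispatched zero check in the spirit of the bullets above. The names are hypothetical and the dispatcher is a hand-written `switch`, not ONNX Runtime's type dispatcher or its `InvokeRet<Status>`; `ElemType`, `CheckStatus`, `CheckZeroDivisor` and `DispatchOnType` exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical element-type tag and status type for this sketch.
enum class ElemType { kInt8, kInt32, kFloat };

struct CheckStatus {
  bool ok = true;
  std::string message;
};

// One template struct handles every element type: the scan exists only for
// integral T and becomes a no-op for floating-point T.
template <typename T>
struct CheckZeroDivisor {
  CheckStatus operator()([[maybe_unused]] const void* data,
                         [[maybe_unused]] std::size_t count) const {
    if constexpr (std::is_integral_v<T>) {
      const T* values = static_cast<const T*>(data);
      for (std::size_t i = 0; i < count; ++i) {
        if (values[i] == T{0}) return {false, "Integer modulo by zero"};
      }
    }
    return {};
  }
};

// Picks the instantiation matching the runtime element type and returns its status.
template <template <typename> class Fn>
CheckStatus DispatchOnType(ElemType type, const void* data, std::size_t count) {
  switch (type) {
    case ElemType::kInt8:
      return Fn<std::int8_t>{}(data, count);
    case ElemType::kInt32:
      return Fn<std::int32_t>{}(data, count);
    case ElemType::kFloat:
      return Fn<float>{}(data, count);
  }
  return {false, "Unsupported element type"};
}
```

A `Compute`-style caller would run `DispatchOnType<CheckZeroDivisor>(type, data, count)` and return early on a failed status before doing the actual modulo.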

### Motivation and Context

Integer modulo by zero is UB in C++ and causes a hardware exception that
crashes the process. Float types produce NaN naturally via `std::fmod`,
but int8/int16/int32/int64/uint* types do not. This is the same class of
issue that was fixed for the `Div` operator in #27693, now applied to
the `Mod` operator.
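
The difference between the floating-point and integer cases can be illustrated with a small standalone sketch. `FloatMod` and `IntMod` are hypothetical helpers, and the NaN result assumes an IEEE 754 (IEC 60559) floating-point environment.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Floating-point remainder: a zero divisor yields NaN on IEEE 754 platforms,
// so no explicit guard is needed to avoid a crash.
double FloatMod(double a, double b) { return std::fmod(a, b); }

// Integer remainder: a zero divisor must be rejected before using %,
// because evaluating a % 0 is undefined behavior.
std::optional<int> IntMod(int a, int b) {
  if (b == 0) return std::nullopt;
  if (b == -1) return 0;  // any value modulo -1 is 0; also avoids overflow for the minimum int
  return a % b;
}
```

Here `std::optional` plays the role that an error `Status` plays in the description above: the caller learns that the operation was refused instead of the process crashing.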


---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
adrastogi added a commit that referenced this pull request Mar 31, 2026
This cherry-picks the following commits for the release:

- #26434 [VitisAI]add tensor type bool
- #26452 [VitisAI EP] Fix error in graph resolving
- #26487 [VitisAI] Enable ort::logger usage in
compile_onnx_model_vitisai_ep_v4
- #26519 [VitisAI] Remove unused function body handling in graph fusion
- #26627 [VitisAI] Add External EP Loader
- #26699 [VitisAI] Add support compiled model compatibility information
retrieval and validation
- #27295 Remove s_kernel_registry_vitisaiep.reset() in
deinitialize_vitisai_ep()
- #27356 Add/Update telemetry events
- #27626 Add PE version info to onnxruntime_providers_vitisai.dll
- #27693 Fix integer division by zero crash in CPU EP Div operator
- #27815 Fix overflow in DmlGraphFusionHelper::ProcessInputData
- #27823 Fix new-delete mismatch in DML EP's QuantizeLinear operator

---------

Co-authored-by: Yueqing Zhang <yuz75@Pitt.edu>
Co-authored-by: Yueqing Zhang <yueqingz@amd.com>
Co-authored-by: zpye <yezupei92@foxmail.com>
Co-authored-by: Chunye Wang@AMD <chunywan@amd.com>
Co-authored-by: mingyue <131847423+mingyueliuh@users.noreply.github.com>
Co-authored-by: zz002 <zhenzew@amd.com>
Co-authored-by: Darshak Bhatti <47045043+dabhattimsft@users.noreply.github.com>
Co-authored-by: Darshak Bhatti <dabhatti@micorsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Xiaoxi Han <xiha@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Co-authored-by: skottmckay <979079+skottmckay@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Development

Successfully merging this pull request may close these issues.

int8 / 0 exception not caught for cpu ep

4 participants